Abstract. Traditionally, the Method of (Shannon-Kullback's) Relative Entropy Maximization (REM) is 
considered with linear moment constraints. In this work, the method is studied under frequency moment 
constraints which are non-linear in probabilities. The constraints challenge some justifications of REM since a) 
axiomatic systems are developed for classical linear moment constraints, b) the feasible set of distributions which 
is defined by frequency moment constraints admits several entropy maximizing distributions (I-projections), 
hence probabilistic justification of REM via Conditioned Weak Law of Large Numbers cannot be invoked. 
However, REM is not left completely unjustified in this setting, since Entropy Concentration Theorem and 
Maximum Probability Theorem can be applied. 

The Maximum Renyi/Tsallis' entropy method (maxTent) enters this work because of the non-linearity of the X-frequency 
moment constraints which are used in Non-extensive Thermodynamics. It is shown here that under X-frequency 
moment constraints the maxTent distribution can be unique and different from the I-projection. This implies that 
maxTent does not choose the most probable distribution and that the maxTent distribution is asymptotically 
conditionally improbable. What are adherents of maxTent accomplishing when they maximize Renyi's or Tsallis' 
entropy? 



1 INTRODUCTION 

Let $\Pi$ be a set of empirical probability mass functions (types) which are
defined on an m-element support and which can be based on random samples of
size n. Let the supposed source of the types be a probability mass function q.
The problem (from the category of ill-posed inverse problems) of recovering a
probability distribution from $\Pi$ amounts to selection of type(s) from $\Pi$,
in particular when $n \to \infty$.

The problem (called hereafter the Boltzmann-Jaynes Inverse Problem, BJIP) can
be met in many branches of science, ranging from Statistical Physics (where it
originated) to Computer Tomography. Several approaches to the problem can be
found in the literature. While most of them are tailored to the needs of a
particular branch of science, the method of (Shannon-Kullback's) Relative
Entropy Maximization (REM) is considered by mathematicians to be the general
solution to the problem. Arguments which justify application of REM for
selection of a distribution from $\Pi$ in BJIP range from axiomatic, through
probabilistic and game-theoretic, to pragmatic, and others. As a rule, in order
to be valid they put certain requirements on $\Pi$ and n.

So far, most of the REM-justifying work has concentrated on the case of $\Pi$
defined by the usual linear moment constraints. Such $\Pi$ possesses the
attractive property of convexity, which thanks to concavity of the
Shannon-Kullback's entropy implies uniqueness of the REM-selected distribution
(called the I-projection of q on $\Pi$ in Information Theory). Linearity of the
constraints lies behind the well-known exponentiality of the I-projection.

As [35] indicates, so-called frequency moment constraints appear rather
naturally in several places in Physics. Frequency moments are non-linear in
probabilities and the feasible set $\Pi_f$ which they define is non-convex. Due
to the non-linearity and a symmetry of the constraints, there are multiple
I-projections of q on $\Pi_f$. The non-linearity of the moments, non-convexity
of the feasible set, non-exponentiality of the recovered distribution and its
non-uniqueness challenge several justifications of REM. Two of the most widely
employed REM-justifying arguments, axiomatizations and the Conditioned Weak Law
of Large Numbers (CWLLN), cannot be invoked in this setting, since the axiomatic
systems are developed for linear constraints and CWLLN requires the assumption
of uniqueness of the I-projection. Is there then any reason to select the most
entropic distribution from $\Pi_f$? Yes, since the Entropy Concentration Theorem
(ECT) and the Maximum Probability Theorem (MPT) can be readily used to justify
MaxEnt also in this case. Though MPT was originally stated with a unique
I-projection in mind, the Theorem can be instantly extended also to the case of
multiple I-projections.

The frequency moment constraints can be viewed as a special case of Tsallis'
(cf. [37]) or MNNP (cf. [39], [32]) constraints which are used in the 'hot
topic' of Non-extensive Thermodynamics (NET). These constraints are non-linear
in probabilities as well. NET has arisen from Tsallis' prescription to select
from the set which the constraints define the distribution which maximizes
Tsallis' entropy. Thus, in this area REM was displaced (or generalized, if you
wish) by maximization of Tsallis' entropy. Besides axiomatic justifications
(which are based on extensions of those of REM) and the declared success of
maxTent in modeling power-law phenomena (which allegedly REM cannot model),
there is however as yet no probabilistic justification of the method.

The paper is organized as follows: First, the necessary terminology and
notation is set down. Then probabilistic justifications of REM (CWLLN, ECT and
MPT) are reviewed from the perspective of their applicability in the case of
multiple I-projections. The Maximum Probability Theorem is stated in the
general form which covers the situation of multiple I-projections. Also,
applicability of other justifications is briefly discussed. Next we turn to the
simplest of non-linear moment constraints, frequency moment constraints, and
note that the I-projection on $\Pi_f$ is non-unique and non-exponential.
Frequency moment constraints are then used to provide an illustration for the
general form of the Maximum Probability Theorem. Next, Tsallis' and Renyi's
entropies are introduced, and it is noted that under frequency moment
constraints maximization of Renyi-Tsallis' entropy (maxTent) selects no
distribution. Under MNNP constraints it does, but as it will be shown, the
maxTent-selected distribution can be unique but different from the
I-projection. Consequences of this finding for maxTent are discussed.
Concluding comments sum up the paper and point to further considerations. The
Appendix describes a method for finding I-projections on $\Pi_f$.



2 TERMINOLOGY AND 
NOTATION 

Let $X = \{x_1, x_2, \dots, x_m\}$ be a discrete finite set called support, with
m elements, and let $\{X_l, l = 1, 2, \dots, n\}$ be a sequence of size n of
identically and independently drawn random variables taking values in X.

A type $\nu = [n_1, n_2, \dots, n_m]/n$ is an empirical probability mass
function which can be based on the sequence $\{X_l, l = 1, 2, \dots, n\}$. Thus,
$n_i$ denotes the number of occurrences of the i-th element of X in the sequence.

Let $\mathcal{P}(X)$ be the set of all probability mass functions (pmf's) on X.
Let $\Pi \subset \mathcal{P}(X)$.

Let the supposed source of the sequences (and hence also of types) be
$q \in \mathcal{P}(X)$, called the (prior) generator.

Let $\pi(\nu)$ denote the probability that q will generate type $\nu$, ie.
$\pi(\nu) = \frac{n!}{n_1! \cdots n_m!} \prod_{i=1}^{m} q_i^{n_i}$. Then,
$\pi(\nu \in A)$ denotes the probability that q will generate a type $\nu$ which
belongs to $A \subset \Pi$, ie. $\pi(\nu \in A) = \sum_{\nu \in A} \pi(\nu)$.
Finally, let $\pi(\nu \in A \mid \nu \in \Pi)$ denote the conditional
probability that if q generates a type $\nu \in \Pi$ then the type belongs to A.
It is assumed that the conditional probability exists.

The I-projection $\hat{p}$ of q on a set $\Pi \subset \mathcal{P}(X)$ is such
$\hat{p} \in \Pi$ that $I(\hat{p}\|q) = \inf_{p \in \Pi} I(p\|q)$, where^1
$I(p\|q) = \sum_{X} p_i \log\frac{p_i}{q_i}$ is the I-divergence. The
I-divergence is known under various other names: Kullback-Leibler's distance,
KL number, Kullback's directed divergence, etc. When taken with the minus sign
it is known as (Shannon-Kullback's) relative entropy.
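
The definition of the I-divergence, together with the conventions for zero probabilities, can be sketched in a few lines of Python. This is only an illustration; the function names are assumptions of this sketch and do not come from the paper. The final check uses the elementary identity $I(p\|u) = \log m - H(p)$ for a uniform generator u, which is why REM with uniform q coincides with maximization of Shannon's entropy.

```python
import math

def i_divergence(p, q):
    """I(p||q) = sum_i p_i * log(p_i / q_i), with the natural logarithm."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue              # convention: a zero-probability term contributes 0
        if qi == 0.0:
            return math.inf       # convention: log(p_i / 0) = +infinity for p_i > 0
        total += pi * math.log(pi / qi)
    return total

def shannon_entropy(p):
    """H(p) = -sum_i p_i * log(p_i), zero terms skipped."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0.0)

u = [1 / 3, 1 / 3, 1 / 3]         # uniform generator on a 3-element support
p = [2 / 3, 1 / 6, 1 / 6]

# For a uniform generator, I(p||u) = log(m) - H(p).
gap = abs(i_divergence(p, u) - (math.log(3) - shannon_entropy(p)))
print(i_divergence(p, u), gap)
```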

The general framework of this work is established by the Boltzmann-Jaynes
inverse problem (BJIP)^2:

Let there be a set $\Pi \subset \mathcal{P}(X)$ of types which are defined on
an m-element support X and which can be based on random samples of size n. Let
the supposed source of the random samples (and thus also types) be q. BJIP
amounts to selection of specific type(s) from $\Pi$ when the information
$\{X, n, q, \Pi\}$ is supplied.

Example 1: Let n = 6, X = [1 2 3], q = [1/3 1/3 1/3] and let the feasible set
comprise all such types which have probability of one of the support-points
equal to 2/3, ie. $\Pi$ = {[2/3 1/6 1/6], [2/3 1/3 0], ...} where the dots
stand for the remaining 7 permutations of the two listed types. Given the
information $\{n, X, q, \Pi\}$ the BJIP task is to select a type from the set
$\Pi$.
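
Example 1 is small enough to be checked by brute force. The following Python sketch (function names are illustrative assumptions, not part of the paper) enumerates all types with n = 6 on a 3-element support, keeps those with one component equal to 2/3, and evaluates the probability $\pi(\nu)$ of each feasible type under the uniform generator, using exact rational arithmetic.

```python
from fractions import Fraction
from math import factorial

def type_probability(counts, q):
    """pi(nu) = n!/(n_1! ... n_m!) * prod_i q_i^{n_i} for the type with given counts."""
    n = sum(counts)
    coef = factorial(n)
    for c in counts:
        coef //= factorial(c)          # stays an integer at every step
    prob = Fraction(coef)
    for c, qi in zip(counts, q):
        prob *= Fraction(qi) ** c
    return prob

def types(n, m):
    """Yield all count vectors of length m which sum to n."""
    if m == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in types(n - first, m - 1):
            yield (first,) + rest

n, m = 6, 3
q = [Fraction(1, 3)] * m

# Feasible set of Example 1: one support point has relative frequency 2/3.
feasible = [c for c in types(n, m)
            if any(Fraction(ci, n) == Fraction(2, 3) for ci in c)]
probs = {c: type_probability(c, q) for c in feasible}
best = max(probs.values())
most_probable = sorted(c for c, pr in probs.items() if pr == best)
print(len(feasible), most_probable, best)
```

The enumeration returns the 9 types mentioned in the Example; the permutations of [4 1 1]/6 are the ones which the uniform generator produces with the highest probability.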



^1 There, the conventions $\log 0 = -\infty$, $\log\frac{b}{0} = +\infty$,
$0 \cdot (\pm\infty) = 0$ are assumed. Throughout the paper log denotes the
natural logarithm.

^2 Equivalently the framework could be phrased as a problem of induction (or
updating), cf. [21].



If $\Pi$ contains more than one type (as is the case in the above Example), the
BJIP becomes under-determined and in this sense ill-posed.

3 JUSTIFICATIONS OF REM 

3.1 Conditioned Weak Law of Large 
Numbers 

A result of the Method of Types, which was developed in Information Theory
(cf. [11]), provides a probabilistic justification for application of the REM
method for solving BJIP, when n tends to infinity and $\Pi$ has certain
properties. The result is usually known as the Conditioned Weak Law of Large
Numbers (CWLLN), or as the Gibbs conditioning principle (in the large
deviations literature, see [14], [12]). The argument shows (loosely speaking)
that any type from $\Pi$ which is generated by q and is not close (in
$L_1$-norm) to the I-projection of q on $\Pi$ becomes conditionally improbable
to come from q as the sample size grows large. To establish this result (cf.
[45], [44], [22], [9], [7], [28], [29], [30]) the assumption of uniqueness of
the I-projection is needed.

(CWLLN) Let $\hat{p}$ be the unique I-projection of q on $\Pi$. Let
$q \notin \Pi$. Then for any $\epsilon > 0$

$$\lim_{n \to \infty} \pi(|\nu_i - \hat{p}_i| > \epsilon \mid \nu \in \Pi) = 0, \quad i = 1, 2, \dots, m \qquad (1)$$

Well-studied is the case of closed, convex $\Pi$, which ensures uniqueness of
the I-projection, provided that it exists (cf. [9], and [28], [29], [30] for
further developments). As is well known, in this case the I-projection belongs
to the exponential family of distributions (see [8]).



3.3 Maximum Probability Theorem 

The Maximum Probability Theorem (MPT), which was originally (see [16], Thm 1.)
stated with a unique I-projection in mind, claims that the type in $\Pi$ which
the (prior) generator q can generate with the highest probability converges to
the I-projection of q on $\Pi$, as $n \to \infty$. However, the proof of the
Theorem (cf. [16]) covers the more general situation of multiple I-projections
and thus allows MPT to be stated in the following general form:

(MPT) Let q be a generator. Let a differentiable constraint $F(\nu) = 0$ define
the feasible set of types $\Pi_n \subseteq \Pi$ and let
$\Pi = \{p : F(p) = 0\}$ be the corresponding feasible set of probability mass
functions. Let $\hat{\nu}_j = \arg\max_{\nu \in \Pi_n} \pi(\nu)$,
$j = 1, 2, \dots, l$, be the types which have among all types from $\Pi_n$ the
highest probability of coming from the generator q. Let there be k
I-projections $\hat{p}_1, \hat{p}_2, \dots, \hat{p}_k$ of q on $\Pi$. And let
$n \to \infty$. Then $l = k$ and $\hat{\nu}_j \to \hat{p}_j$ for
$j = 1, 2, \dots, k$.

It should be noted that the MPT argument implies that REM is only a special,
asymptotic form of a simple and self-evident method (called the Maximum
Probability method (MaxProb) in [16]) which seeks in $\Pi$ such types which the
generator q can generate with the highest probability. Thus applicability of
REM in BJIP is inherently limited to the case of sufficiently large n.

Also, it is worth noting that a bayesian interpre- 
tation can be given to MaxProb, which thanks to 
MPT carries over into REM/MaxEnt (cf. [21]). 

From the perspective of the current work, it is important that the MPT holds
also when the feasible set admits multiple types with the highest value of the
probability $\pi$. An illustration of the convergence of most probable types to
I-projections will be given in Section 4, where such a set is determined by
frequency moment constraints.



3.2 Entropy Concentration Theorem 

Without the assumption of uniqueness of the I-projection, a claim known as the
Entropy Concentration Theorem (ECT), weaker than (1), can still be made (see
[7]):

(ECT) Let $\Pi \subset \mathcal{P}(X)$ be nonempty. Let
$t = \inf_{p \in \Pi} I(p\|q)$, so that $t \le I(\nu\|q)$ for any
$\nu \in \Pi$. Then for any $\epsilon > 0$

$$\lim_{n \to \infty} \pi(|I(\nu\|q) - t| < \epsilon \mid \nu \in \Pi) = 1 \qquad (2)$$

An assumption (of whatever form) which guarantees existence and uniqueness of
the I-projection is crucial for passing from statement (2) to the stronger
claim (1).



3.4 Axiomatic systems 

Besides the probabilistic arguments, several axiomatic approaches were
developed to support maximization of Shannon's entropy or relative entropy as
the only logically consistent method for solving BJIP^3. However, it should be
noted that maximization of Renyi's entropy was also found to satisfy some of
the axiomatic systems which had been developed to justify REM (see [42]). For
purposes of the presented work it is sufficient to note that the axiomatic
system (cf. [10]) which is perhaps the most widely accepted requires the
assumption of linearity of the constraints (or, in general, convexity of
$\Pi$). A non-axiomatic argument based on the potential-probability density
relationship and a complementarity (cf. [18]) is restricted to linear
constraints as well. Also a game-theoretic view of REM (see [23]) assumes
linear constraints.

^3 Strictly speaking, the axiomatizations assume BJIP with either n unknown or
n bigger than any limit. They seem to be inappropriate for BJIP with finite
sample size.

To sum up: When $\Pi$ admits several I-projections, the justifications of REM
which are readily available reduce to the Entropy Concentration Theorem and the
Maximum Probability Theorem.

4 FREQUENCY MOMENT 
CONSTRAINTS 

This study was triggered by an interesting paper by Romera, Angulo and Dehesa
(cf. [35]) on the frequency moment problem. Links to statistical considerations
of the frequency moments as well as to their applications in Physics can also
be found there.

In the simplest case of a single frequency moment constraint, the feasible set
of types is defined as
$\Pi_f = \{p : \sum_{i=1}^{m} p_i^{\alpha} - a = 0, \sum_{i=1}^{m} p_i - 1 = 0\}$,
where $\alpha, a \in R$. If m > 2, the problem of selection of a type becomes
ill-posed. Note that for $\alpha \neq 1$ the first constraint is non-linear in
p and $\Pi_f$ is non-convex.
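
The non-convexity of the feasible set is easy to see on a concrete case. The sketch below assumes $\alpha = 2$ and a = 0.42 (the values used in Example 2 of Section 4.2): a pmf and one of its permutations both satisfy the frequency moment constraint, but their midpoint does not. The function name is an illustrative assumption.

```python
def frequency_moment(p, alpha):
    """Frequency moment sum_i p_i^alpha of a pmf p."""
    return sum(pi ** alpha for pi in p)

alpha = 2
p1 = [0.5, 0.4, 0.1]
p2 = [0.1, 0.4, 0.5]               # a permutation of p1, so it has the same moment
mid = [(x + y) / 2 for x, y in zip(p1, p2)]   # convex combination of p1 and p2

# The midpoint is a pmf, but its frequency moment differs from that of p1 and p2,
# hence it falls outside the feasible set.
print(frequency_moment(p1, alpha), frequency_moment(p2, alpha),
      frequency_moment(mid, alpha))
```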

4.1 I-projection: non-uniqueness and 
non-exponentiality 

It is straightforward to observe that the I-projection of q on $\Pi_f$
possesses a symmetry, in the sense that if a certain $\hat{p}$ is an
I-projection of q on $\Pi_f$ then any permutation of the vector $\hat{p}$
should necessarily be an I-projection as well.

Within this Section q will be assumed uniform (for 
a reason which is implied by discussion at Section 
5.1), denoted u. Note that when uniform generator 
is assumed, the method of Relative Entropy Max- 
imization reduces to Maximum Shannon's Entropy 
method (abbreviated usually MaxEnt). 

The non-convexity of the feasible set makes the problem of maximization of
Shannon's entropy analytically unsolvable. The critical value of $p_i$ is
expressed as $p_i(\lambda) = k(\lambda) e^{-\lambda \alpha p_i^{\alpha-1}}$,
where $k(\lambda) = 1/\sum_{i=1}^{m} e^{-\lambda \alpha p_i^{\alpha-1}}$. Note
that the expression is explicitly self-referential.

Thus, the I-projections should be searched out either numerically or by a
method which is described in the Appendix.



4.2 MaxProb justification of REM: multiple 
I-projections 

That the most probable types indeed converge to the corresponding
I-projections, as the general form of the Maximum Probability Theorem states,
will be illustrated by the following Example.

Example 2: Let $\alpha = 2$, X = [1 2 3], m = 3 and a = 0.42 (the value was
obtained for p = [0.5 0.4 0.1]).

For n = 10, 30, 330, 1000, 2000 the feasible sets $\Pi_{f,n}$ were constructed.
For example, $\Pi_{f,10}$ contains $\nu$ = [5 4 1]/10 and all its permutations
(ie. [5 1 4]/10, etc). This will be called a group of types. $\Pi_{f,30}$
contains two groups: [15 12 3]/30 and [17 8 5]/30. The latter has the higher
probability of coming from the uniform prior generator. For n = 330 the
feasible set comprises the groups [0.0939 0.4333 0.4727],
[0.5666 0.2666 0.1666], [0.1 0.4 0.5] and the group [0.1939 0.2333 0.5727],
which has the highest probability of being generated by u.

For each n, among the feasible types, the most probable type $\hat{\nu}$ which
could be drawn from the uniform prior generator was picked. They are stated in
Table 1 together with a corresponding I-projection of u on $\Pi_f$.

TABLE 1. The most probable type, for growing n.

n               most probable type
10              0.1       0.4       0.5
30              0.166     0.266     0.566
330             0.1939    0.2333    0.5727
1000            0.1990    0.2280    0.5730
2000            0.2080    0.2185    0.5735
I-projection    0.2131    0.2131    0.5737


Clearly, the most probable type (hence also the whole permutation group of 6
most probable types) converges to the pmf (permutation group of 3 pmf's) which
maximizes Shannon's entropy.
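
The small-n part of Example 2 can be reproduced by direct enumeration. The Python sketch below is an assumption-laden illustration rather than the authors' procedure: it imposes the constraint $\sum n_i^2 = 0.42\,n^2$ in exact integer arithmetic, lists each group of types by its sorted counts, and ranks groups by the multinomial coefficient, which under the uniform generator is proportional to $\pi(\nu)$. It is exercised only for n = 10 and n = 30; the function names are hypothetical.

```python
from math import lgamma

def log_multinomial(counts):
    """log of n!/(n_1! n_2! n_3!); proportional to log pi(nu) under uniform q."""
    n = sum(counts)
    return lgamma(n + 1) - sum(lgamma(c + 1) for c in counts)

def feasible_groups(n, num=42, den=100):
    """Sorted count triples n1 <= n2 <= n3 with sum n and
    sum of squares equal to (num/den) * n^2, checked in integers."""
    groups = []
    for n1 in range(n + 1):
        for n2 in range(n1, n + 1):
            n3 = n - n1 - n2
            if n3 < n2:
                break                      # larger n2 only makes n3 smaller
            if den * (n1 * n1 + n2 * n2 + n3 * n3) == num * n * n:
                groups.append((n1, n2, n3))
    return groups

def most_probable_group(n):
    """Group of feasible types with the largest multinomial coefficient."""
    groups = feasible_groups(n)
    return max(groups, key=log_multinomial) if groups else None

for n in (10, 30):
    print(n, feasible_groups(n), most_probable_group(n))
```

For n = 10 the only group is [1 4 5]/10 and for n = 30 the group [5 8 17]/30 is the more probable of the two, in agreement with the first two rows of Table 1.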



4.3 maxTent: no selection 

At this point, both Renyi's and Tsallis' entropies will be introduced. Renyi's
entropy (cf. [36], [34]) is defined as
$H_R(p) = \frac{1}{1-\alpha} \log\left(\sum_{i=1}^{m} p_i^{\alpha}\right)$,
where $\alpha \in R$, $\alpha \neq 1$.

Tsallis' entropy $H_T$ (cf. [24], [43], [37]) is a linear approximation of
Renyi's entropy:
$H_T(p) = \frac{1}{1-\alpha}\left(\sum_{i=1}^{m} p_i^{\alpha} - 1\right)$,
where $\alpha \in R$, $\alpha \neq 1$.

Renyi's entropy attains its maximum at the same pmf as does Tsallis' entropy.
Thus, hereafter maxTent will denote both the method of maximum Renyi's and of
maximum Tsallis' entropy at once. maxTent will be discussed in greater detail
in Section 5. Here it suffices to note that in the set $\Pi_f$ which is defined
by the frequency moment constraint each type has the same value of Renyi's (or
Tsallis') entropy, because both entropies depend on p only through the
constrained sum $\sum p_i^{\alpha}$. In other words, maxTent refuses to make a
choice from $\Pi_f$. Recall that MaxEnt selects I-projections, and ECT implies
that types conditionally concentrate on the I-projections in such a way that,
as n gets large, there is virtually no chance to find a type which has a value
of Shannon's entropy different from the maximal one. MPT complements it by
stating that the most probable types turn into the I-projections as n goes to
infinity.
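
That maxTent cannot discriminate among members of the set defined by a frequency moment constraint, while MaxEnt can, may be seen numerically. The sketch below uses the standard forms of Renyi's and Tsallis' entropies and two pmf's with $\sum p_i^2 = 0.42$: the pmf [0.5 0.4 0.1] of Example 2 and the pmf with two equal components obtained in the Appendix. Function and variable names are illustrative assumptions.

```python
import math

def renyi_entropy(p, alpha):
    """H_R(p) = log(sum_i p_i^alpha) / (1 - alpha), alpha != 1."""
    return math.log(sum(pi ** alpha for pi in p)) / (1.0 - alpha)

def tsallis_entropy(p, alpha):
    """H_T(p) = (sum_i p_i^alpha - 1) / (1 - alpha), alpha != 1."""
    return (sum(pi ** alpha for pi in p) - 1.0) / (1.0 - alpha)

def shannon_entropy(p):
    return -sum(pi * math.log(pi) for pi in p if pi > 0.0)

alpha = 2
r = math.sqrt(2.08)
p_a = [0.5, 0.4, 0.1]
# Smaller root of 6 p^2 - 4 p + 0.58 = 0, ie. approximately [0.2131 0.2131 0.5737].
p_b = [(4 - r) / 12, (4 - r) / 12, 1 - (4 - r) / 6]

for p in (p_a, p_b):
    print(renyi_entropy(p, alpha), tsallis_entropy(p, alpha), shannon_entropy(p))
```

Both pmf's have the same Renyi's and Tsallis' entropy, whereas their Shannon's entropies differ.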

5 X-FREQUENCY MOMENT 
CONSTRAINTS 

Frequency moment constraints can be viewed as a special case of non-linear
constraints which were originally introduced into Statistical Mechanics by
Tsallis (see [37]). Tsallis' constraints define the feasible set $\Pi_T$ as
follows:
$\Pi_T = \{p : \sum_{i=1}^{m} p_i^{\alpha} x_i - a = 0, \sum_{i=1}^{m} p_i - 1 = 0\}$.

Tsallis' constraints were for Physics reasons superseded by TMP constraints
(see [39]). Later on, the TMP constraints were rearranged by Martinez, Nicolas,
Pennini and Plastino [32] into the MNNP form, which allows for simpler analytic
tractability. The TMP constraints in MNNP form specify the feasible set as
$\Pi_\tau = \{p : \sum_{i=1}^{m} p_i^{\alpha}(x_i - b) = 0, \sum_{i=1}^{m} p_i - 1 = 0\}$.
A probability mass function (pmf) from $\Pi_\tau$ at which Tsallis' (or
Renyi's) entropy attains its maximum will be called the $\tau$-projection.

Since the argument which is presented in Section 5.4 is valid both for Tsallis'
constraints and MNNP constraints, both of them will be referred to hereafter as
X-frequency moment constraints.



5.1 maxTent: backward compatibility with 
MaxEnt 

Non-extensive Thermodynamics (NET) prescribes maximization of Tsallis' entropy
for the pmf selection when the feasible set is defined by X-frequency
constraints. As was already mentioned, the distribution selected by
maximization of Tsallis' entropy is the same as that selected by Renyi's
entropy maximization. Though it is not our concern here, for completeness it
should be noted that Renyi's entropy is extensive (additive) whilst Tsallis'
one is not, and that the 'world according to Renyi' has different properties
than the 'world according to Tsallis' (see [26]).

Maximization of Renyi-Tsallis' entropy under X-frequency constraints satisfies
the elementary requirement of backward compatibility with MaxEnt: when the
X-frequency constraints reduce to the classic linear moment constraints,
Tsallis' entropy reduces to Shannon's one (it happens for $\alpha \to 1$). In
relation to this, it should be noted that maximization of Shannon's entropy is,
from the point of view of probabilistic justifications, just a special case
(uniform q) of Relative Entropy Maximization. However, no relative form of
Tsallis' entropy has yet been considered by adherents of NET. For this reason,
in our considerations the general prior distribution q is replaced by the
uniform one, u.

5.2 MaxEnt: non-exponentiality 
maxTent: power law 

Maximization of Shannon's entropy under the MNNP form of X-frequency moment
constraints by the Lagrange multiplier technique leads to a pmf which is of
implicit and self-referential form only:
$p_i(\lambda) \propto e^{-\lambda \alpha (x_i - b) p_i^{\alpha-1}}$. Whether it
is the I-projection and whether it is unique cannot be analytically assessed.

Under MNNP constraints, maximization of Renyi-Tsallis' entropy by means of a
Lagrangean leads to first order conditions for an extremum which are solved by
a pmf of power-law form:
$p_i(\lambda) \propto (1 + \lambda x_i(\alpha - 1))^{1/(1-\alpha)}$ (see [32]).
It is important to note that the candidate pmf could be a (local/global)
maximum only if $\alpha > 0$ and if $1 + \lambda x_i(\alpha - 1) > 0$ for all
$i = 1, 2, \dots, m$. The latter requirement, known as Tsallis' cut-off
condition, should be checked on a case-by-case basis.
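
A case-by-case check of the cut-off condition is mechanical. The following sketch evaluates the power-law candidate $p_i \propto (1 + \lambda x_i(\alpha - 1))^{1/(1-\alpha)}$ for a given multiplier and returns nothing when the requirement $\alpha > 0$ or the cut-off condition fails. The function name, the support X = [-2 0 1] and the two values of $\lambda$ are assumptions chosen purely for illustration.

```python
def power_law_candidate(x, lam, alpha):
    """Normalized pmf with p_i proportional to
    (1 + lam * x_i * (alpha - 1)) ** (1 / (1 - alpha)).
    Returns None if alpha <= 0, alpha == 1, or the cut-off condition fails."""
    if alpha <= 0 or alpha == 1:
        return None
    bases = [1.0 + lam * xi * (alpha - 1.0) for xi in x]
    if any(b <= 0.0 for b in bases):
        return None                      # Tsallis' cut-off condition violated
    weights = [b ** (1.0 / (1.0 - alpha)) for b in bases]
    total = sum(weights)
    return [w / total for w in weights]

x = [-2.0, 0.0, 1.0]                     # hypothetical support points
ok = power_law_candidate(x, -0.12132, 2) # hypothetical multiplier, cut-off holds
cut = power_law_candidate(x, 1.5, 2)     # 1 + 1.5 * (-2) < 0, cut-off fails
print(ok, cut)
```

Returning `None` rather than a clipped pmf keeps the failure of the cut-off condition visible to the caller.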

5.3 Generalized entropies and BJIP 

Non-shannonian forms of entropies have been around for a long time. Some of
them fall into the category of convex statistical distances, and their
mathematical properties are well-studied (cf. [31]). Also, extensions and
modifications of axiomatic systems which lead to non-shannonian entropies were
studied (see [3]). Some of the 'new' entropies were found useful, some not (cf.
[2]). As far as Renyi's entropy is concerned, a few of its 'operational
characterizations' were developed in Information Theory (cf. [1] and literature
cited therein). Little seems to be known, however, about its probabilistic
justification in the context of ill-posed inverse problems. In particular, it
is not known what the probabilistic question is that maxTent answers. Neither
is it known whether the unknown question which maxTent answers is meaningful to
ask within the context of BJIP.



5.4 MaxEnt vs. maxTent 

The maxTent method is presented by adherents of NET as a generalization of
MaxEnt. The generalization extends MaxEnt in two directions: Shannon's entropy
is generalized into Tsallis' entropy, and the traditional linear moment
constraints are generalized into non-linear constraints, either Tsallis'
constraints or MNNP constraints. Though no objection can be made to the
generalization of constraints, rather vague arguments (see for instance the
Introduction of [40]) were advanced to explain why, under the X-frequency
constraints, maximization of Shannon's entropy should be replaced by
maximization of Tsallis' entropy to select a distribution from the feasible set
which the constraints define.

The Conditioned Weak Law of Large Numbers (or Gibbs conditioning principle),
the Entropy Concentration Theorem and the Maximum Probability Theorem provide
probabilistic justification of the REM (and hence also of the MaxEnt) method
(though adherents of maxTent might have failed to note it, see [41]). As was
discussed here, ECT and MPT can be readily used also under any non-linear
constraints, and hence the two Theorems give justification to application of
REM/MaxEnt also under Tsallis' or MNNP constraints. Thus, when n is
sufficiently large (which is indeed the case in Statistical Mechanics), anybody
who chooses the I-projection(s) from the feasible set which is defined by, say,
MNNP constraints can be sure 1) that (any of) the I-projection(s) is just such
a type in the feasible set which can be drawn from q with the highest
probability when n goes to infinity (recall MPT), and moreover 2) that any type
whose value of the relative entropy is not close to the maximal value which is
attainable within the feasible set is asymptotically conditionally improbable
(recall ECT).

In an interesting paper [27], which for the first time exposed maxTent to
criticism from a probabilistic point of view, La Cour and Schieve derived
necessary conditions for agreement of I- and $\tau$-projections under MNNP
constraints. Also, the authors illustrated by means of a specific example
($\alpha$ = 1/2, m = 3, X = [1 2 3] and a = 7/11) that the $\tau$-projection
can be different from the I-projection. Provided that the I-projection is
unique, one can safely recall CWLLN to conclude that the maxTent-selected
$\tau$-projection on $\Pi_\tau$ is asymptotically conditionally improbable.
However, the issue of uniqueness or non-uniqueness of the I-projection on
$\Pi_\tau$ is, to the best of our knowledge, not settled yet.

A different argument is used here to show that maxTent can select an
asymptotically conditionally improbable distribution under X-frequency
constraints. The argument is based on the observation that by a choice of
support points of the random variable X the feasible set of distributions
$\Pi_\tau$ can be made convex (the same can be done with $\Pi_T$). Convexity of
$\Pi_\tau$ guarantees uniqueness of the I-projection. Provided that
$\alpha > 0$ (which implies concavity of Tsallis' entropy) the
$\tau$-projection on the convex $\Pi_\tau$ is unique as well. Both the
I-projection and the $\tau$-projection can then be found by straightforward
analytic maximization. Since the two are (except for trivial cases) different,
CWLLN implies that the one chosen by maxTent has asymptotically zero
conditional probability.

The next Example illustrates the argument. 

Example 3: Let
$\Pi_\tau = \{p : \sum_{i=1}^{3} p_i^{\alpha}(x_i - b) = 0, \sum_{i=1}^{3} p_i - 1 = 0\}$
with $\alpha = 2$. Let X = [-2 0 1] and let b = 0. Then
$\Pi_\tau = \{p : p_3^2 = 2p_1^2, \sum_{i=1}^{3} p_i - 1 = 0\}$, which
effectively reduces to
$\Pi_\tau = \{p : p_2 = 1 - p_1(1 + \sqrt{2}), p_3 = \sqrt{2}p_1\}$. The prior
generator q is assumed to be the uniform u.

The feasible set $\Pi_\tau$ is convex. Thus the I-projection $\hat{p}$ of u on
$\Pi_\tau$ is unique, and can be found by direct analytic maximization to be
$\hat{p}$ = [0.2748 0.3366 0.3886]. Straightforward maximization of
Renyi-Tsallis' entropy leads to the unique $\tau$-projection
$\hat{p}_\tau$ = [0.2735 0.3398 0.3867], which is different from $\hat{p}$.
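
The two projections of Example 3 can be recomputed numerically. The sketch below rests on the following assumptions: $\alpha = 2$, X = [-2 0 1], b = 0, and the one-parameter form $p = [t, 1 - t(1+\sqrt{2}), \sqrt{2}t]$ of the feasible set. The I-projection is found by bisection on the derivative of Shannon's entropy along the segment; for $\alpha = 2$ the maximizer of Renyi-Tsallis' entropy is the minimizer of $\sum p_i^2$, which has a closed form. All names are illustrative.

```python
import math

SQRT2 = math.sqrt(2.0)
C = 1.0 + SQRT2

def pmf(t):
    """Point of the feasible segment: p = [t, 1 - C t, sqrt(2) t], 0 < t < 1/C."""
    return [t, 1.0 - C * t, SQRT2 * t]

def entropy_slope(t):
    """d/dt of Shannon's entropy of pmf(t); the constant terms cancel."""
    p1, p2, p3 = pmf(t)
    return -math.log(p1) + C * math.log(p2) - SQRT2 * math.log(p3)

def i_projection(tol=1e-12):
    """Maximizer of Shannon's entropy on the segment, by bisection on the slope."""
    lo, hi = 1e-9, 1.0 / C - 1e-9      # slope is positive near lo, negative near hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if entropy_slope(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return pmf(0.5 * (lo + hi))

def tau_projection():
    """For alpha = 2: minimize sum p_i^2 = 3 t^2 + (1 - C t)^2,
    whose stationary point is t = C / (3 + C^2)."""
    return pmf(C / (3.0 + C * C))

p_i_proj = i_projection()
p_tau = tau_projection()
print([round(v, 4) for v in p_i_proj], [round(v, 4) for v in p_tau])
```

Under these assumptions the sketch gives values which agree with the two pmf's quoted in the Example to about three decimal places, and the two projections are distinct.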

The finding that the $\tau$-projection can be asymptotically conditionally
improbable prompts Jaynes' question: What are adherents of maxTent
accomplishing when they maximize Renyi-Tsallis' entropy?

6 CONCLUDING COMMENTS 

Frequency moment constraints, which are the simplest of non-linear constraints,
were employed in this work to define the feasible set of types $\Pi_f$ for the
Boltzmann-Jaynes Inverse Problem. Non-linearity of the frequency constraints
implies non-convexity of the feasible set, and together with their symmetry
also non-uniqueness of the I-projection. Moreover, because of the
non-linearity, I-projections of q on the feasible set $\Pi_f$ do not take the
canonical exponential form.

The non-linearity, non-convexity, non-uniqueness and non-exponentiality
revealed limitations of several justifications of the REM/MaxEnt method.
However, REM is not left completely unjustified in this non-traditional setup,
since two justifications of REM are provided by the Entropy Concentration
Theorem and the Maximum Probability Theorem. Thus, though REM under frequency
constraints loses two of its charming properties, uniqueness and exponentiality
of the I-projection, its application within the corresponding BJIP remains
justified by the two Theorems. One of the primary aims of this work was to give
a general (multiple I-projection) formulation of the Maximum Probability
Theorem and provide its illustration. At the same time the work was intended to
serve as an invitation to the challenging world of non-linear constraints which
shake several traditional views of REM/MaxEnt.

The Maximum Renyi/Tsallis' entropy method (maxTent) was considered here mainly
because of the non-linearity of the constraints which are used in Non-extensive
Thermodynamics (NET). As was shown (see Sect. 5), under these constraints
maxTent can select a distribution which is, according to CWLLN, asymptotically
conditionally improbable. This finding prompts Jaynes' question: What are
adherents of maxTent accomplishing when they maximize Renyi-Tsallis' entropy?
Once it is answered, maxTent could enter the tiny class of entropies for which
the answer is known and which can thus be consciously applied for distribution
selection.
APPENDIX 

Observe that any of the three I-projections in Example 2 (Section 4.2) has two
of its probabilities equal. This can be elucidated by the following elementary
considerations: suppose that the feasible set is constrained further by the
additional requirement $p_1 = p_2 = p_3$. This additional requirement makes
$p_0$ = [1/3 1/3 1/3] the only pmf in the set. Clearly, the pmf is indeed in
the set only if $a = a_0 = 1/3$, ie. the 'centre of mass' of
$\sum_{i=1}^{m} p_i^{\alpha}$. If $a \neq a_0$ then $p_0$ is not in $\Pi_f$,
hence the most entropic pmf should be searched among those pmf's which have two
of the probabilities equal; say $p_1 = p_2$.

The additional requirement turns the under-determined conditions into a
quadratic equation which is solved by either $p_1 \approx 0.2131$ or
$p_1 \approx 0.4535$. Hence the restricted feasible set comprises two groups of
pmf's, [0.2131 0.2131 0.5737] and [0.4535 0.4535 0.0930]. The first pmf has
Shannon's entropy $H_U$ = 0.9777, the second $H_L$ = 0.9381. It is not
surprising that pmf's from the original set $\Pi_f$ (ie. those which can have
all three probabilities different) have Shannon's entropy within the bounds
which are set by $H_L$ and $H_U$.

This is obviously not a property specific to the studied example with the
particular choice of $\alpha = 2$ and m = 3. In general, the finding permits
stating the following

Proposition Let q be uniform,
$\Pi_f = \{p : \sum p_i^{\alpha} - a = 0, \sum p_i - 1 = 0\}$, where
$p \in R^m$ and $\alpha \in Z$. Let $m > \alpha$. Then $\hat{p} \in \Pi_f$ such
that $H(p) \le H(\hat{p})$ for any $p \in \Pi_f$ is such that
$\hat{p}_1 = \hat{p}_2 = \dots = \hat{p}_{m-1}$, where $\hat{p}_1$ is one of
the solutions of the following algebraic equation:

$$(m-1)\hat{p}_1^{\alpha} + (1 - (m-1)\hat{p}_1)^{\alpha} - a = 0 \qquad (3)$$

Note: Clearly, among the pmf's which solve equation (3), $\hat{p}$ is the one
with the highest value of Shannon's entropy H. Any permutation of $\hat{p}$ is
also an I-projection of u on $\Pi_f$.
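
For $\alpha = 2$ equation (3) is a quadratic, so the recipe of the Proposition can be followed directly. The sketch below (names are illustrative assumptions; only the case $\alpha = 2$ is handled) solves the quadratic, builds the candidate pmf's with m - 1 equal components, and picks the one with the highest Shannon's entropy. With m = 3 and a = 0.42 it recovers the values quoted for Example 2.

```python
import math

def shannon_entropy(p):
    return -sum(pi * math.log(pi) for pi in p if pi > 0.0)

def candidates_alpha2(m, a):
    """Solve (m-1) p^2 + (1 - (m-1) p)^2 - a = 0 and return the feasible pmf's
    of the form [p, ..., p, 1 - (m-1) p]."""
    k = m - 1
    # Expanded form of equation (3) for alpha = 2: k (k+1) p^2 - 2 k p + (1 - a) = 0
    qa, qb, qc = k * (k + 1), -2.0 * k, 1.0 - a
    disc = qb * qb - 4.0 * qa * qc
    if disc < 0.0:
        return []                      # no real solution: the feasible set is empty
    out = []
    for sign in (-1.0, 1.0):
        p = (-qb + sign * math.sqrt(disc)) / (2.0 * qa)
        last = 1.0 - k * p
        if p >= 0.0 and last >= 0.0:   # keep only genuine pmf's
            out.append([p] * k + [last])
    return out

cands = candidates_alpha2(3, 0.42)
best = max(cands, key=shannon_entropy)
for c in cands:
    print([round(v, 4) for v in c], round(shannon_entropy(c), 4))
```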



BIBLIOGRAPHIC NOTE 

Literature on Tsallis' maximum entropy method is vast (cf. [38]). arXiv
contains a series of preprints which document the evolution of the method.
Also, see the March 2002 issue of Chaos, Solitons and Fractals. Interesting
introductory remarks on NET can be found in [6]. Critical voices are rare:
besides the fundamental [27] see for instance also [15], [46]. This work draws
on and corrects [19].

July 2002, May-July 2003 